AITopics

Country: North America > United States > Washington > Pierce County > Tacoma (0.50)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Neural Information Processing SystemsMay-25-2025, 10:21:59 GMT

Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning Qi Qian 2 School of Engineering and Technology, University of Washington, Tacoma, WA98402, USA

Multiple clustering aims to discover various latent structures of data from different aspects. Deep multiple clustering methods have achieved remarkable performance by exploiting complex patterns and relationships in data. However, existing works struggle to flexibly adapt to diverse user-specific needs in data grouping, which may require manual understanding of each clustering. To address these limitations, we introduce Multi-Sub, a novel end-to-end multiple clustering approach that incorporates a multi-modal subspace proxy learning framework in this work. Utilizing the synergistic capabilities of CLIP and GPT-4, Multi-Sub aligns textual prompts expressing user preferences with their corresponding visual representations. This is achieved by automatically generating proxy words from large language models that act as subspace bases, thus allowing for the customized representation of data in terms specific to the user's interests. Our method consistently outperforms existing baselines across a broad set of datasets in visual multiple clustering tasks.

large language model, machine learning, natural language, (21 more...)

Country: North America > United States > Washington > Pierce County > Tacoma (0.50)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Neural Information Processing SystemsMay-23-2025, 13:28:16 GMT

Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP Qi Qian 1 Alibaba Group, Bellevue, WA98004, USA

Vision-language pre-training methods, e.g., CLIP, demonstrate an impressive zeroshot performance on visual categorizations with the class proxy from the text embedding of the class name. However, the modality gap between the text and vision space can result in a sub-optimal performance. We theoretically show that the gap cannot be reduced sufficiently by minimizing the contrastive loss in CLIP and the optimal proxy for vision tasks may reside only in the vision space. Therefore, given unlabeled target vision data, we propose to learn the vision proxy directly with the help from the text proxy for zero-shot transfer. Moreover, according to our theoretical analysis, strategies are developed to further refine the pseudo label obtained by the text proxy to facilitate the intra-modal proxy learning (InMaP) for vision. Experiments on extensive downstream tasks confirm the effectiveness and efficiency of our proposal. Concretely, InMaP can obtain the vision proxy within one minute on a single GPU while improving the zero-shot accuracy from 77.02% to 80.21% on ImageNet with ViT-L/14@336 pre-trained by CLIP.

large language model, machine learning, proxy, (19 more...)

Country: North America > United States > Washington > Pierce County > Tacoma (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceJan-30-2025

BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos

Lin, Lehao, Wang, Ke, Abdallah, Maha, Cai, Wei

In recent years, the rapid development of artificial intelligence (AI) especially multi-modal Large Language Models (MLLMs), has enabled it to understand text, images, videos, and other multimedia data, allowing AI systems to execute various tasks based on human-provided prompts. However, AI-powered bots have increasingly been able to bypass most existing CAPTCHA systems, posing significant security threats to web applications. This makes the design of new CAPTCHA mechanisms an urgent priority. We observe that humans are highly sensitive to shifts and abrupt changes in videos, while current AI systems still struggle to comprehend and respond to such situations effectively. Based on this observation, we design and implement BounTCHA, a CAPTCHA mechanism that leverages human perception of boundaries in video transitions and disruptions. By utilizing AI's capability to expand original videos with prompts, we introduce unexpected twists and changes to create a pipeline for generating short videos for CAPTCHA purposes. We develop a prototype and conduct experiments to collect data on humans' time biases in boundary identification. This data serves as a basis for distinguishing between human users and bots. Additionally, we perform a detailed security analysis of BounTCHA, demonstrating its resilience against various types of attacks. We hope that BounTCHA will act as a robust defense, safeguarding millions of web applications in the AI-driven era.

large language model, machine learning, natural language, (16 more...)

2501.18565

Country:

Asia > South Korea (0.46)
Asia > China > Guangdong Province (0.14)
North America > United States > Washington > Pierce County > Tacoma (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Neural Information Processing SystemsJan-26-2025, 06:02:05 GMT

Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation

Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock, anderson nascimento

Classification of personal text messages has many useful applications in surveillance, e-commerce, and mental health care, to name a few. Giving applications access to personal texts can easily lead to (un)intentional privacy violations. We propose the first privacy-preserving solution for text classification that is provably secure. Our method, which is based on Secure Multiparty Computation (SMC), encompasses both feature extraction from texts, and subsequent classification with logistic regression and tree ensembles. We prove that when using our secure text classification method, the application does not learn anything about the text, and the author of the text does not learn anything about the text classification model used by the application beyond what is given by the classification result itself. We perform end-to-end experiments with an application for detecting hate speech against women and immigrants, demonstrating excellent runtime results without loss of accuracy.

data mining, machine learning, natural language, (19 more...)

Country: North America > United States > Washington > Pierce County > Tacoma (0.14)

Genre: Research Report > New Finding (0.89)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.88)
Law Enforcement & Public Safety (0.88)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

arXiv.org Artificial IntelligenceNov-6-2024

Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning

Yao, Jiawei, Qian, Qi, Hu, Juhua

Multiple clustering aims to discover various latent structures of data from different aspects. Deep multiple clustering methods have achieved remarkable performance by exploiting complex patterns and relationships in data. However, existing works struggle to flexibly adapt to diverse user-specific needs in data grouping, which may require manual understanding of each clustering. To address these limitations, we introduce Multi-Sub, a novel end-to-end multiple clustering approach that incorporates a multi-modal subspace proxy learning framework in this work. Utilizing the synergistic capabilities of CLIP and GPT-4, Multi-Sub aligns textual prompts expressing user preferences with their corresponding visual representations. This is achieved by automatically generating proxy words from large language models that act as subspace bases, thus allowing for the customized representation of data in terms specific to the user's interests. Our method consistently outperforms existing baselines across a broad set of datasets in visual multiple clustering tasks.

large language model, machine learning, natural language, (20 more...)

2411.03978

Country: North America > United States > Washington > Pierce County > Tacoma (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

arXiv.org Artificial IntelligenceJan-7-2024

Bridging the Skills Gap: Evaluating an AI-Assisted Provider Platform to Support Care Providers with Empathetic Delivery of Protocolized Therapy

Kearns, William R., Bertram, Jessica, Divina, Myra, Kemp, Lauren, Wang, Yinzhou, Marin, Alex, Cohen, Trevor, Yuwen, Weichao

Despite the high prevalence and burden of mental health conditions, there is a global shortage of mental health providers. Artificial Intelligence (AI) methods have been proposed as a way to address this shortage, by supporting providers with less extensive training as they deliver care. To this end, we developed the AI-Assisted Provider Platform (A2P2), a text-based virtual therapy interface that includes a response suggestion feature, which supports providers in delivering protocolized therapies empathetically. We studied providers with and without expertise in mental health treatment delivering a therapy session using the platform with (intervention) and without (control) AI-assistance features. Upon evaluation, the AI-assisted system significantly decreased response times by 29.34% (p=0.002), Both groups rated the system as having excellent usability. Introduction Mental health conditions are highly prevalent and exert a considerable burden on society, with a global estimated cost of 125.3 million disability-adjusted life years in 2019 The utility of empathy-related AI support for provider selection of professionally crafted text messages remains unevaluated and is the focus of the current work. Response retrieval has several benefits over generation including the safety and controllability of the responses.

artificial intelligence, machine learning, natural language, (17 more...)

2401.03631

Country:

North America > United States > Washington > Pierce County > Tacoma (0.14)
North America > United States > Washington > King County > Seattle (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.69)

Nasr, Milad, Carlini, Nicholas, Hayase, Jonathan, Jagielski, Matthew, Cooper, A. Feder, Ippolito, Daphne, Choquette-Choo, Christopher A., Wallace, Eric, Tramèr, Florian, Lee, Katherine

Scalable Extraction of Training Data from (Production) Language Models

arXiv.org Artificial IntelligenceNov-28-2023

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.

large language model, machine learning, natural language, (22 more...)

2311.17035

Country:

South America (1.00)
North America > United States > California (1.00)
Asia > Middle East (1.00)
(7 more...)

Genre:

Personal (1.00)
Research Report > New Finding (0.92)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media > Television (1.00)
(26 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-30-2023

Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP

Qian, Qi, Xu, Yuanhong, Hu, Juhua

Vision-language pre-training methods, e.g., CLIP, demonstrate an impressive zero-shot performance on visual categorizations with the class proxy from the text embedding of the class name. However, the modality gap between the text and vision space can result in a sub-optimal performance. We theoretically show that the gap cannot be reduced sufficiently by minimizing the contrastive loss in CLIP and the optimal proxy for vision tasks may reside only in the vision space. Therefore, given unlabeled target vision data, we propose to learn the vision proxy directly with the help from the text proxy for zero-shot transfer. Moreover, according to our theoretical analysis, strategies are developed to further refine the pseudo label obtained by the text proxy to facilitate the intra-modal proxy learning (InMaP) for vision. Experiments on extensive downstream tasks confirm the effectiveness and efficiency of our proposal. Concretely, InMaP can obtain the vision proxy within one minute on a single GPU while improving the zero-shot accuracy from $77.02\%$ to $80.21\%$ on ImageNet with ViT-L/14@336 pre-trained by CLIP. Code is available at \url{https://github.com/idstcv/InMaP}.

large language model, machine learning, proxy, (20 more...)

2310.19752

Country: North America > United States > Washington > Pierce County > Tacoma (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceApr-13-2023

Multi-Subset Approach to Early Sepsis Prediction

Ewig, Kevin, Lin, Xiangwen, Stewart, Tucker, Stern, Katherine, O'Keefe, Grant, Teredesai, Ankur, Hu, Juhua

Sepsis is a life-threatening organ malfunction caused by the host's inability to fight infection, which can lead to death without proper and immediate treatment. Therefore, early diagnosis and medical treatment of sepsis in critically ill populations at high risk for sepsis and sepsis-associated mortality are vital to providing the patient with rapid therapy. Studies show that advancing sepsis detection by 6 hours leads to earlier administration of antibiotics, which is associated with improved mortality. However, clinical scores like Sequential Organ Failure Assessment (SOFA) are not applicable for early prediction, while machine learning algorithms can help capture the progressing pattern for early prediction. Therefore, we aim to develop a machine learning algorithm that predicts sepsis onset 6 hours before it is suspected clinically. Although some machine learning algorithms have been applied to sepsis prediction, many of them did not consider the fact that six hours is not a small gap. To overcome this big gap challenge, we explore a multi-subset approach in which the likelihood of sepsis occurring earlier than 6 hours is output from a previous subset and feed to the target subset as additional features. Moreover, we use the hourly sampled data like vital signs in an observation window to derive a temporal change trend to further assist, which however is often ignored by previous studies. Our empirical study shows that both the multi-subset approach to alleviating the 6-hour gap and the added temporal trend features can help improve the performance of sepsis-related early prediction.

artificial intelligence, machine learning, prediction, (16 more...)

2304.06384

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Washington > Pierce County > Tacoma (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)